D++: Structural credit assignment in tightly coupled multiagent domains

Authors

  • Aida Rahmattalabi
  • Jen Jen Chung
  • Mitchell K. Colby
  • Kagan Tumer
Abstract

Autonomous multiagent teams can be used in complex exploration tasks both to expedite exploration and to improve efficiency. However, the use of multiagent systems presents additional challenges. Specifically, in domains where the agents' actions are tightly coupled, coordinating multiple agents to achieve cooperative behavior at the group level is difficult. In this work, we demonstrate that reward shaping can greatly benefit learning in tightly coupled multiagent exploration tasks. We argue that in tightly coupled domains, effective coordination depends on rewarding stepping stone actions: actions that would improve the system's objective but are not rewarded because other agents have not yet found their proper actions. To this end, we build on current work in the multiagent structural credit assignment literature and extend the idea of counterfactuals introduced in difference evaluation functions. Difference evaluation functions have a number of properties that make them ideal as a learning signal, such as sensitivity to an agent's actions and alignment with the global system objective. However, they fail to tackle the coordination problem in domains where the agent coupling is tight. Extending the idea of counterfactuals, we propose a novel reward structure, D++. We investigate the performance of D++ in two different multiagent domains. We show that while both the global team performance and the difference evaluation function fail to properly reward stepping stone actions, our proposed algorithm successfully rewards such behaviors and provides superior performance (a 166% performance improvement and a fourfold convergence speedup) compared to policies learned using either the global reward or the difference reward.

Tuesday, August 2, 2016, 10:00 AM, Rogers 226, School of Mechanical, Industrial, and Manufacturing Engineering
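
The abstract's point about stepping stone actions can be illustrated with a small sketch. The toy domain, the function names, and the constant `REQUIRED` below are all hypothetical and are not taken from the paper. The abstract also does not state the formula for D++, so `added_counterfactual_reward` is only an assumed illustration of a counterfactual that adds agents rather than removing one; it should not be read as the authors' definition.

```python
from collections import Counter

# Hypothetical toy domain: a point of interest (POI) counts toward the
# global reward only when at least REQUIRED agents observe it together.
REQUIRED = 2


def global_reward(positions):
    """G(z): number of POIs observed by at least REQUIRED agents."""
    counts = Counter(positions)
    return sum(1 for n in counts.values() if n >= REQUIRED)


def difference_reward(positions, i):
    """Difference evaluation: G(z) minus G(z with agent i removed)."""
    without_i = positions[:i] + positions[i + 1:]
    return global_reward(positions) - global_reward(without_i)


def added_counterfactual_reward(positions, i, extra=1):
    """Assumed form: (G(z plus `extra` copies of agent i) - G(z)) / extra."""
    with_copies = positions + [positions[i]] * extra
    return (global_reward(with_copies) - global_reward(positions)) / extra


# Agent 0 waits alone at POI "A"; agent 1 is at POI "B".
positions = ["A", "B"]
print(global_reward(positions))                   # 0
print(difference_reward(positions, 0))            # 0
print(added_counterfactual_reward(positions, 0))  # 1.0
```

In this toy case agent 0 has taken a useful stepping stone action by reaching POI "A", yet both the global reward and the difference reward are zero because no partner has arrived. The added-agent counterfactual returns a positive signal, which is the kind of behavior the abstract says neither of the other two rewards captures.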

Similar resources

Multi-objective Multiagent Credit Assignment Through Difference Rewards in Reinforcement Learning

Multiagent systems have had a powerful impact on the real world. Many of the systems it studies (air traffic, satellite coordination, rover exploration) are inherently multi-objective, but they are often treated as single-objective problems within the research. A very important concept within multiagent systems is that of credit assignment: clearly quantifying an individual agent’s impact on th...

Using communication to reduce locality in distributed multiagent learning

This paper attempts to bridge the fields of machine learning, robotics, and distributed AI. It discusses the use of communication in reducing the undesirable effects of locality in fully distributed multi-agent systems with multiple agents/robots learning in parallel while interacting with each other. Two key problems, hidden state and credit assignment, are addressed by applying local undirected ...

A Replicator Dynamics Analysis of Difference Evaluation Functions

A key difficulty in Cooperative Coevolutionary Algorithms (CCEAs) is the credit assignment problem[1]. One solution to the credit assignment problem is the difference evaluation function, which produces excellent results in many multiagent domains. However, to date, there has been no prescriptive theoretical analysis deriving conditions under which difference evaluations improve the probability...

Multi-objective multiagent credit assignment in reinforcement learning and NSGA-II

Multiagent systems have had a powerful impact on the real world. Many of the systems it studies (air traffic, satellite coordination, rover exploration) are inherently multi-objective, but they are often treated as single-objective problems within the research. A key concept within multiagent systems is that of credit assignment: quantifying an individual agent’s impact on the overall system pe...

Multi-Objective Multiagent Credit Assignment in NSGA-II Using Difference Evaluations

Determining the contribution of an agent to a system-level objective function (credit assignment) is a key area of research in cooperative multiagent systems. Multi-objective optimization is a growing area of research, though mostly focused on single agent settings. Many real-world problems are multiagent and multi-objective, (e.g., air traffic management, scheduling observations across multipl...

Publication date: 2016